Data Compression for Bitmap Indexes
Abstract
Compression ratio (CR) and logical operation time (LOT) are the two major measures of bitmap index efficiency. Previous works [5, 9, 10, 11] compare the performance of bitmap compression schemes on logical operation time and on compression ratio separately. This paper describes these works and proposes a new metric, the overall efficiency indicator. The overall efficiency indicator is an integrated approach to evaluating bitmap compression schemes that takes both the compression ratio and the logical operation time into account. A further investigation examines the effect of bitmap index density on the overall efficiency indicator, shedding light on how to select a compression scheme for a given bitmap index density. The findings show that, among gzip, BBC and WAH, WAH is the most efficient compression scheme for bitmap index densities in the range 0.0001 to 0.5. This finding is consistent with the recommendations of previous studies of bitmap index compression [9, 10, 11].
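To make the quantities named in the abstract concrete, the following is a minimal Python sketch of bitmap index density and compression ratio. It rests on several assumptions that do not come from the paper: the function names are hypothetical, the compression is a naive run-length encoding (not gzip, BBC or WAH), each run is assumed to cost one 32-bit word, and the compression ratio is taken as compressed size divided by uncompressed size. The paper's own definition of the overall efficiency indicator is not given in the abstract and is not reproduced here.

```python
from itertools import groupby


def build_bitmap(column, value):
    """Bitmap for one attribute value: bit i is 1 when row i holds `value`."""
    return [1 if v == value else 0 for v in column]


def density(bitmap):
    """Fraction of bits in the bitmap that are set to 1."""
    return sum(bitmap) / len(bitmap)


def run_length_encode(bitmap):
    """Naive run-length encoding as (bit, run_length) pairs.

    Illustrative stand-in only; this is NOT the WAH or BBC format.
    """
    return [(bit, len(list(run))) for bit, run in groupby(bitmap)]


def compression_ratio(bitmap, bits_per_run=32):
    """Compressed size / uncompressed size, in bits.

    Assumes every run is stored in one word of `bits_per_run` bits.
    """
    compressed_bits = len(run_length_encode(bitmap)) * bits_per_run
    return compressed_bits / len(bitmap)


def bitmap_and(a, b):
    """Bitwise AND of two uncompressed bitmaps of equal length."""
    return [x & y for x, y in zip(a, b)]


# Toy column with two distinct values, stored in sorted order.
column = ["a"] * 900 + ["b"] * 100
bitmap_b = build_bitmap(column, "b")

print(density(bitmap_b))            # fraction of rows equal to "b"
print(run_length_encode(bitmap_b))  # two runs: 900 zeros, then 100 ones
print(compression_ratio(bitmap_b))  # two 32-bit words for 1000 bits
```

In this toy setting a sorted, low-density bitmap collapses into very few runs, which is why density is a natural variable to study when comparing compression schemes. Measuring logical operation time would additionally require timing operations such as `bitmap_and` on the compressed forms, which this sketch does not attempt.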
Similar resources
Performance of Multi-Level and Multi-Component Compressed Bitmap Indexes
Bitmap indexes are known as the most effective indexing methods for range queries on append-only data, especially for low cardinality attributes. Recently, bitmap indexes were also shown to be just as effective for high cardinality attributes when certain compression methods are applied. There are many different bitmap indexes in the literature but no definite comparison among them has been mad...
SBH: Super byte-aligned hybrid bitmap compression
Bitmap indexes are commonly used in data warehousing applications such as on-line analytic processing (OLAP). Storing the bitmaps in compressed form has been shown to be effective not only for low cardinality attributes, as conventional wisdom would suggest, but also for high cardinality attributes. Compressed bitmap indexes, such as Byte-aligned been shown to be efficient in terms of both time...
Compressing Bitmap Indexes for Faster Search Operations
In this paper, we study the effects of compression on bitmap indexes. The main operations on the bitmaps during query processing are bitwise logical operations such as AND, OR, NOT, etc. Using the general purpose compression schemes, such as gzip, the logical operations on the compressed bitmaps are much slower than on the uncompressed bitmaps. Specialized compression schemes, like the byte-aligned...
Better bitmap performance with Roaring bitmaps
Bitmap indexes are commonly used in databases and search engines. By exploiting bit-level parallelism, they can significantly accelerate queries. However, they can use much memory, and thus we might prefer compressed bitmap indexes. Following Oracle’s lead, bitmaps are often compressed using run-length encoding (RLE). Building on prior work, we introduce the Roaring compressed bitmap format: it...
Reordering Columns for Smaller Indexes
Column-oriented indexes—such as projection or bitmap indexes—are compressed by run-length encoding to reduce storage and increase speed. Sorting the tables improves compression. On realistic data sets, permuting the columns in the right order before sorting can reduce the number of runs by a factor of two or more. For many cases, we prove that the number of runs in table columns is minimized if...